ABSTRACT

The present investigation extends the factor analytic 
approach pioneered by Fleishman (1964), incorporating subsequent 
developments in the application of confirmatory factor analysis and 
the physical fitness literature (e.g., an emphasis on maximum oxygen 
intake). More specifically, the ability of an a priori factor 
structure of physical fitness to fit (i.e., account for) data from 
the 1985 Australian Health and Fitness Survey based on 25 indicators 
of fitness (field exercises, technical measures, and laboratory 
measures) is tested for 2,817 boys and girls aged 9, 12, and 15. An 
8-factor model derived from previous research fit the data well for 
each of the 6 age/gender groups considered separately. Based on tests 
of factorial invariance, factor loadings and factor correlations were 
reasonably invariant across the six groups. This substantively 
important finding indicates that all 25 indicators were equally valid 
for boys and girls aged 9, 12, and 15. The results provided clear 
support for the multidimensionality of physical fitness and call into 
question attempts to summarize fitness with a single indicator (e.g., 
aerobic power) or a total score representing different components of 
physical fitness. (Contains 43 references.) (Author/lAH) 
Abstract

The present investigation extends the factor analytic approach pioneered by Fleishman 
(1964), incorporating subsequent developments in the application of confirmatory factor analysis 
and the physical fitness literature (e.g., an emphasis on VO2max). More specifically, the ability 
of an a priori factor structure of physical fitness to fit (i.e., account for) data based on 25 
indicators of fitness (field exercises, technical measures, and laboratory measures) is tested for 
2,817 boys and girls aged 9, 12, and 15. An 8-factor model derived from previous research fit the 
data well for each of the 6 age/gender groups considered separately. Based on tests of factorial 
invariance, factor loadings and factor correlations were reasonably invariant across the six 
groups. This substantively important finding indicates that all 25 indicators were equally valid 
for boys and girls aged 9, 12 and 15. The results provided clear support for the 
multidimensionality of physical fitness and call into question attempts to summarize fitness with 
a single indicator (e.g., VO2max) or a total score representing different components of physical 
fitness. 
The Multidimensional Structure of Physical Fitness: Invariance Over Gender and Age

Physical fitness is a widely valued goal for men and women of all ages. Of particular 
relevance to the present investigation, there is a growing concern about youth fitness: young 
people's poor physical fitness, sedentary lifestyles, and levels of obesity. Related concerns are 
also evident in the shift in emphasis in sport/exercise research and physical education from a 
narrow focus on sport and elite athletes towards a broader focus on health-related fitness and 
epidemiological studies of youth fitness. Physical fitness is also posited as a mediating variable 
that contributes to health-related outcomes, social skills, and a variety of measures of 
psychological well-being. 

A Construct Validity Approach 

Despite the importance of the physical fitness construct, theoretical and empirically tested 
models of the structure of physical fitness have not been given adequate attention. On the one 
hand, there is a growing use of reasonably distinct, narrowly defined indicators of physical 
fitness without clarifying how they fit into the overall structure of physical fitness. On the other 
hand, there is an increasing number of physical fitness batteries based on implicit, typically 
untested assumptions about the structure of physical fitness and its generality across age, gender, 
and other individual characteristics. Because physical fitness is a hypothetical construct, its 
construct validity must be established. In a construct validity approach, investigations can be 
classified as within-construct studies that evaluate the internal structure of physical fitness using 
techniques such as factor analysis, or between-construct studies that attempt to establish a 
theoretically consistent, logical pattern of relations between measures of physical fitness and 
other constructs. The resolution of at least some within-construct issues should be a logical 
prerequisite to between-construct research. This emphasis on construct validity, factor analysis, 
and within-construct studies of the structure of physical fitness is the focus of the present 
investigation. 

Physical fitness tests are typically compared with age and gender norms or, perhaps, more 
sophisticated norms that also take into account body composition (height, weight, body fat) or 
biological maturity (e.g., Malina, 1989). Such comparisons, however, may confound the 
influences of skill, motivation, compensatory growth that reflects demands placed on the body by 
physical exertion, and genetically determined developmental growth (see discussion by 
Krahenbuhl, 1980). Of greater relevance to the present investigation, there has not been adequate 
attention given to the question of whether a given test measures the same component of physical 
fitness with equal validity for boys and for girls, and across different ages. This is a particularly 
relevant concern in tests of youth fitness during the early-adolescent period that is so potentially 
turbulent - biologically and psychologically. It is important to emphasize that this concern is not 
one of differences in levels of performance that can be evaluated in relation to appropriately 
constructed norms. Rather, the concern is more fundamental, asking whether the same physical 
fitness indicator has the same meaning across subjects who differ in age and gender. If the same 
indicators reflect different components of fitness depending on age and gender, then the task of 
interpreting each indicator and presenting a profile of different components of fitness would be 
considerably more complicated. Such a finding would also call into question many current 
practices in assessing fitness which implicitly assume that the underlying meaning of a particular 
indicator is relatively invariant. This critical concern can only be addressed within the context of 
a construct validity approach and an evaluation of whether - and how - the structure of physical 
fitness varies according to individual characteristics such as gender and age. This issue, because 
of its importance to theory and practice, and because it has been given little attention in the 
physical fitness literature, will be a major focus of the present investigation. 

In the physical fitness literature a distinction is typically made between the large sample 
epidemiology-like studies of youth fitness that rely primarily on easily administered field 
exercises that do not require expensive equipment and the small-sample laboratory studies of 
adult (or elite athlete) fitness that emphasize technically sophisticated measures which require 
expensive equipment. This distinction, however, invites the potential for confusing the 
indicators of physical fitness with the physical fitness construct and for confusing the 
technological sophistication required to obtain a measure with the construct validity of a 
measure. The inexpensively collected field exercises should not be viewed as "poor cousins" of 
the more expensive laboratory measures and the technologically sophisticated measures are not 
necessarily more valid indicators of the physical fitness construct. The purpose of the field 
exercises is not to provide a necessarily imperfect prediction of the laboratory measures that 
could be achieved if only adequate resources were available to test all subjects in a laboratory 
setting. Rather, both the field exercises and the laboratory measures are merely indicators of the 
physical fitness construct whose validity should be systematically evaluated within a construct 
validity approach. Hence there is a need to evaluate the structure of physical fitness in studies 
that include a wide array of field exercises and sophisticated laboratory measures, and that are 
based on sufficiently large samples to appropriately apply statistical techniques such as factor 
analysis. 

Physical Fitness: A Multidimensional Construct 

Critical concerns in the study of physical fitness are the definition of the construct and the 
selection of appropriate indicators. The position taken here is that physical fitness is a 
multidimensional construct and that physical fitness cannot be adequately understood if this 
multidimensionality is ignored. 

Although there are many definitions of fitness, Clarke's general definition is widely 
accepted: "the ability to carry out daily tasks with vigor and alertness, without undue fatigue, and 
with ample energy to enjoy leisure-time pursuits and to meet unforeseen emergencies" (1976, p. 
12) and "Physical fitness is the ability to last, to bear up, to withstand stress, and persevere under 
difficult circumstances where an unfit person would quit. It is the opposite to becoming fatigued 
from ordinary efforts, to lacking energy to enter zestfully into life's activities, and to becoming 
exhausted from unexpected, demanding physical exertion" (Clarke, 1979, p. 28). Safrit (1981, p. 
213) stressed that physical fitness is a multidimensional construct that cannot be adequately 
reflected by a single measure, and that physical fitness tests should measure the full range of 
functional capacities and accurately reflect changes in appropriate physical activity and altered 
capacity. Similarly, based on his extensive literature review of physical fitness tests, Fleishman 
(1964, p. 37) concluded that: "There is no such thing as general physical proficiency. The 
problem is a multidimensional one." Baumgartner and Jackson (1987, p. 277) noted that "as the 
concept of physical fitness has moved away from athletic participation toward health-related 
fitness, there has been greater emphasis on cardiovascular function, body composition 
(leanness/fatness), strength, endurance, and lower-back flexibility, traits shown by medical and 
exercise scientists to promote health and reduce the risk of disease." 

The original American Association for Health, Physical Education and Recreation 
(AAHPER) test battery had considerable impact on theory, measurement, research and practice 
of youth fitness assessment. Items were selected according to the requirements of: (a) minimal 
equipment, (b) student familiarity, (c) ease of administration by classroom teachers, (d) 
appropriateness across gender and (adolescent) ages, and (e) broad selection of different fitness 
components. The original battery consisted of 7 items: pull-ups, sit-ups, shuttle run, standing long 
jump, 50 yard dash, 600 yard run, and a softball throw. Its major advantages were ease of 
administration, objectivity, face validity, and the availability of nationally representative 
normative data. The AAHPER battery in various forms has been repeatedly administered to large 
representative samples of school children in many Western countries including the US, UK, 
Australia, and New Zealand since the late 1950s. Evaluations of the construct validity (e.g., 
Baumgartner and Jackson, 1987; Cureton, 1980; Safrit, 1981) of the items, however, have called 
into question some aspects of the test and led to its subsequent revision. Ponthieux and Barker 
(1963) factor analyzed the AAHPER items and reported three factors defined primarily by the 
600 yard run, pull-ups, and sit-ups, by the long jump, shuttle run, and 50 yard dash, and a single- 
item factor defined by the softball throw. In his classic factor analysis study of a wide variety of 
physical activities including some of the AAHPER items, Fleishman (1964) reported that 4 of the 
7 items (shuttle run, softball throw, 50-yard dash, and standing long jump) loaded on a factor that 
he called explosive strength. Reflecting these and other concerns, the battery was subsequently 
altered, for example, by replacing the pull-up for girls with the flexed-arm hang, eliminating the 
softball throw that apparently has a substantial skill component, replacing the 600 yard event 
with a longer one mile or 9 minute run (Disch, Jackson, and Frankiewicz, 1975), modifying the 
sit-up test, and adding the skin fold measurements to assess body composition, and the sit-and- 
reach test to assess flexibility (AAHPERD, 1980). 

Aerobic Power and VO2max

Increasingly - in apparent contrast to the multidimensional perspective emphasized here 
- researchers have adopted an implicitly unidimensional approach in which physical fitness is 
defined in terms of aerobic power, reflecting the integration of the cardiovascular, pulmonary, 
vascular, and muscular systems. In field exercises it is measured indirectly by items such as 
running a moderately long distance (e.g., 1.6 km run or 12 minute run) or step tests. With more 
sophisticated equipment it is measured by the working capacity or power produced at a given 
heart rate of 170 beats/minute (PWC170) using a bicycle ergometer or treadmill. In laboratories 
with sophisticated equipment the "gold standard" measure of cardiovascular fitness is maximum 
oxygen intake (VO2max). For adults, the usual criteria that VO2max has been achieved (e.g., 
Boutcher, 1990; Cunningham, 1980; Schell & Leelarthaepin, 1990) are: the engagement in 
strenuous exercise involving continuous, rhythmic movement lasting at least 15 minutes (e.g., 
running, cycling, swimming) that use large muscles and depend on oxidative energy systems; 
anaerobic metabolism as indicated by high levels of lactic acid; and a plateau in oxygen intake 
(VO2max) with increasing peak work loads. Zwiren, Freedson, Ward, Wilke and Rippe (1991) 
recently compared direct measures of VO2max with estimates based on five exercises for young 
adult females. They reported that VO2max was more highly correlated with performance on a 
1.5 mile run (r=.79) than with a step test (r=.55) or with heart rates on a submaximal cycle 
ergometer (r=.66). These results are consistent with a number of studies reporting high 
correlations between VO2max and running speed over distances of 1.6 km or more in which 
correlations as high as .9 have been reported (Cooper, 1968). There is, however, considerable 
variation in these results and Cunningham (1980) indicated stronger relations are typically 
reported when students are more successfully motivated. 
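
To put the correlations reported above on a common footing, the short sketch below converts each r into the proportion of variance it shares with directly measured VO2max (r squared). Only the three correlations come from the text; the function and variable names are illustrative assumptions, and the rounding to two decimals is a presentation choice.

```python
# Correlations with directly measured VO2max, as reported in the text above.
reported_r = {
    "1.5 mile run": 0.79,
    "step test": 0.55,
    "submaximal cycle ergometer": 0.66,
}


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r * r


# Hypothetical summary table: indicator name -> shared variance, rounded.
shared = {name: round(shared_variance(r), 2) for name, r in reported_r.items()}

for name, value in shared.items():
    print(f"{name}: r^2 = {value:.2f}")
```

Read this way, even the strongest field estimate leaves a substantial share of the variance in VO2max unexplained.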

The trend towards defining physical fitness exclusively in terms of VO2max is 
unfortunate, and appears to reflect a confusion between physical fitness, which is a hypothetical 
construct, and VO2max, which is only one indicator of this construct. Furthermore, this reliance 
on VO2max as the "gold standard" measure of fitness leads to implicit assumptions about the 
structure of physical fitness that may be unwarranted. In particular, this situation seems to imply 
a relatively unidimensional construct of physical fitness that is inferred from VO2max, and that 
other indicators of fitness are important primarily in terms of how they relate to VO2max and 
cardiovascular endurance. In contrast to this implicitly unidimensional perspective, Safrit (1981) 
argues that physical fitness is a multidimensional construct that cannot be adequately represented 
by a single indicator. Bar-Or (1987) argued that this emphasis on VO2max ignores other 
components of youth fitness, particularly childhood obesity that initiates a vicious circle that 
includes decreasing physical activity, poor self-esteem, and the inability to socialize. Baumgartner 
and Jackson (1987) noted that adult obesity is a serious health problem and that 85% of adult 
obesity is linked to childhood obesity. Cureton (1987, p. 319) argued that "less attention should 
be given to cardiovascular fitness and more to the relation of physical activity and fitness to 
health/disease risk." Seefeldt and Vogel (1987) noted the dangers associated with over-reliance 
on a single-indicator approach, arguing instead for more broadly based, multidimensional 
definitions of fitness. Sallis (1987) argued that from the perspective of public health, physical 
activity is more important than fitness per se, and that the major benefit of physical activity in 
childhood is to establish patterns that are carried into adulthood. The American College of Sports 
Medicine (1990) recognized that moderate levels of physical activity that are insufficient to 
influence VO2max may have important benefits on physical health. Similarly, Sharkey (1991) 
noted that health benefits apparently plateau at relatively low levels of VO2max and proposed 
alternative tests that place more emphasis on what he referred to as endurance fitness. Also, 
Boutcher (1990) noted that as much as 90% of the variance of VO2max may be genetically 
determined, calling into question the usefulness of VO2max measures taken at a single point in 
time. Hence, particularly in relation to health-related fitness and multidimensional perspectives 
of physical fitness, the trend toward VO2max as the "gold standard" measure of fitness is 
premature and unwarranted. More generally, it is important to evaluate VO2max within a 
broader context of within- and between-network studies that incorporate a wide range of 
different fitness indicators and to establish its location within the multidimensional structure of 
physical fitness. Due in part to the typically small sample sizes in VO2max studies, there 
apparently have been no large-scale factor analyses that included VO2max and indicators of a 
wide variety of other components of physical fitness. 

The Structure of Physical Fitness: A Factor Analysis Approach

Edwin Fleishman's classic research (1964) on the structure and measurement of physical 
fitness provides an important basis for the present investigation. He applied factor analytic 
techniques to identify the components of physical fitness and to select appropriate indicators to 
include in a comprehensive fitness battery. Based on his review of factor analytic studies of 
physical fitness, Fleishman proposed specific components of fitness that he broadly classified 
into factors of strength that are the primary focus of the present investigation and factors of 
speed, flexibility, balance and coordination. 
In the strength area, his results supported a priori predictions of separate components of 
strength and, perhaps, strength factors that are specific to different parts of the body. In 
particular, he demonstrated the distinctions between dynamic strength, static strength, and 
explosive strength. Dynamic Strength was defined by items such as pull-ups, push-ups, bent arm 
hang, rope climb, dips and squat thrusts "in which the arms are required repeatedly or 
continuously, to move to support the weight of the body" (p. 64), although short running tests, 
vertical and broad jumps, and sit-ups also loaded on this factor. Static Strength was defined 
primarily by the use of dynamometers and items reflecting the capacity to apply force to lift or 
push weights; a maximum force is exerted for a brief period of time where the force is exerted 
continuously up to this maximum (p. 65). Explosive Strength was defined by items such as short 
dashes, long and vertical jumps, and a softball throw that emphasize "the ability to extend 
maximum energy in one explosive act ... rather than continuous strain, stress, or repeated 
exertion" (p. 66). Fleishman also noted some evidence for separate components that are specific 
to particular parts of the body (arm, leg, trunk) or particular activities (e.g., running). Depending 
on the length of the test battery, Fleishman recommended the inclusion of tests of dynamic 
strength (push-ups, pull-ups), static strength (hand grip, arm pull dynamometer), explosive 
strength (50-yard dash or shuttle run, long jump, and softball throw), and trunk strength (leg lifts 
and hold half sit-up). 

Fleishman's (1964) corresponding analysis of speed, flexibility, balance and coordination 
tests is less relevant to the present investigation because only one such test (sit-and-reach) was 
included in the test battery used here. He reported 6 factors that only partially supported his 
hypotheses: extent flexibility, dynamic flexibility, gross body equilibrium, balance with visual 
cues, speed of limb movements, and an explosive strength factor (like that in the analysis of 
strength tests). 

The major emphasis of Fleishman's research was on physical fitness as a 
multidimensional construct. This emphasis was also reflected in his Performance Record for 
Basic Fitness Tests (Fleishman, 1964) in which a multidimensional profile of physical fitness 
components is presented. For each component, raw, percentile, and stanine scores are presented. 
Fleishman (1964, p. 136) specifically noted that "the most useful information is provided by 
separate tests, since this allows the pinpointing of strong and weak areas." Apparently bowing to 
popular demand, however, he also included a total fitness score — the sum of the stanine scores — 
because "many instructors and students feel the need for a single index to summarize a student's 
over-all performance" (p. 141). He justified this total score in part because the battery of tests 
reflected separate factors so that no one factor was unduly weighted. Fleishman also noted, 
however, that it may be better to differentially weight each indicator in relation to a particular 
criterion using statistical techniques such as multiple regression, but argued that in the fitness 
area such external criteria are seldom available. Further de-emphasizing the total score, 
Fleishman noted that it was primarily useful for quick comparisons but that "one should not lose 
sight of the fact that an FI [total fitness index] of 50 could be achieved by average performance 
on all tests, or by a combination of exceptional and inferior performances on a number of 
different tests" (p. 142). In this sense, overall fitness according to Fleishman (1964) is best 
represented as a multidimensional profile of different components of physical fitness. 
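
Fleishman's caution about the total score can be made concrete with a small sketch. It assumes a battery of 10 tests (so that an average stanine of 5 on every test gives the FI of 50 he mentions); the function name and both profiles are hypothetical and are not taken from his normative data.

```python
def fitness_index(stanines):
    """Total fitness index computed as the sum of stanine scores (each 1 to 9)."""
    if any(not 1 <= s <= 9 for s in stanines):
        raise ValueError("stanine scores must lie between 1 and 9")
    return sum(stanines)


# Hypothetical student A: average (stanine 5) on all 10 assumed tests.
average_profile = [5] * 10

# Hypothetical student B: exceptional on some tests, inferior on others.
uneven_profile = [9, 9, 8, 8, 5, 5, 2, 2, 1, 1]

print(fitness_index(average_profile))  # 50
print(fitness_index(uneven_profile))   # 50
```

Both students receive the same total, although only the profile shows where the second student is strong and where weak.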

There was little emphasis on cardiovascular endurance in Fleishman's (1964) study. 
Although he specifically noted the possibility of a cardiovascular endurance factor measured by 
long distance runs and prolonged body exertion, he did not consider relevant indicators in his 
factor analysis. Noting that "in retrospect, it would have been useful to include some variant of 
the 600 yard run-walk test" (p. 70), he concluded that the relation between cardiovascular 
endurance and the strength factors that he did consider "remains to be seen" (p. 71). However, in 
the large normative study of his Basic Fitness Tests, he indicated that "while it was not feasible 
to include the 600-Yard Run Walk in our experimental studies, this test was added in later stages 
of our normative study to provide a measure of Stamina or Cardio-Vascular Endurance" (p. 104). 
Thus, although he does not evaluate the relation between cardiovascular fitness and other fitness 
indicators, Fleishman does provide normative comparisons for boys and girls of differing ages on 
one indicator of this factor. 

Fleishman (1964) did not specifically address the issue of how well his a priori structure 
of physical fitness generalizes across age and gender. This is not, perhaps, surprising since both 
his major factor analysis studies involved United States Navy recruits who were relatively 
homogeneous in terms of age and gender. In his much larger "national study" boys and girls 
between the ages of 12 and 18 were tested. In this study, considerable attention was given to age 
and gender differences in levels of performance. However, apparently because only one 
indicator per factor was used to define physical fitness in the national study, Fleishman (1964) 
did not present factor analyses of these data or report the consistency of relations among the 
indicators over gender and age. Hence the critically important issue of the generalizability of the 
structure of physical fitness over age and gender was not evaluated by Fleishman (1964). 

Fleishman's (1964) factor analytic research of physical fitness reflected the "state of the 
art" in the early 1960s. In the ensuing three decades, however, there have been important new 
developments in the application of factor analysis and indices used to infer physical fitness. 
Thus, it would seem likely that the physical fitness literature should contain a large number of 
factor analytic studies following the important tradition established by Fleishman. Remarkably, 
however, a review of the current physical fitness literature reveals a surprising dearth of factor 
analytic studies. A computer search of several major indices (ERIC, PsycINFO, AUSSPORT, 
MEDLINE) revealed thousands of studies with "factor analysis" or "physical fitness" as a 
descriptor, but almost no studies with both descriptors. Searches using these descriptors revealed 
no factor analyses of a comprehensive selection of physical fitness indicators published between 
1980 and 1992 (except, perhaps, Hagan, Parrish, and Licciardone, 1991), and only a few relevant 
studies published between 1966 and 1980. Whereas this search does not mean that there have 
been no comprehensive factor analytic studies of physical fitness measures in the last decade, it 
seems reasonable to conclude that there has been limited recent work in this area. It is not that 
the relevance of this factor analytic approach has waned in that: (a) textbooks in physical 
education and related areas almost universally emphasize components of fitness based in part on 
factor analytic studies such as Fleishman's (1964) research; (b) there currently exists a heated 
debate about the relative importance of VO2max as the "gold standard" measure of physical 
fitness and a multidimensional perspective to physical fitness; and (c) there is a growing number 
of physical fitness batteries that are based on implicit, apparently untested, underlying factor 
structures that are assumed to generalize across age, gender, and other individual characteristics. 
In a related concern about fitness batteries, Sharkey (1991; pp. 5-6; also see Sharkey, 1988) 
noted that "these fitness batteries are based on a concept of generality, wherein factor analysis 
indicates common clusters or groups of items associated with a component of fitness" and that "it 
is clear that no single test or component of fitness adequately represents the entire component." 
It seems that researchers and practitioners have relied on classic factor analytic studies such as 
Fleishman's research and intuition to classify an ever increasing number of specific physical 
fitness indicators into largely untested categories of physical fitness, implicitly assuming that 
each indicator reflects the same component of physical fitness with equal validity for subjects 
differing in age, gender, and other individual characteristics. Whereas such blind faith may be 
justified by intuition and expert opinion, it is also important to pursue empirical tests of these 
implicit assumptions. 

The Present Investigation 

The purpose of the present investigation is to test the ability of an a priori factor structure 
of physical fitness to account for data based on the Australian Health and Fitness Survey 
(AHAFS) and the extent to which the same factor structure fits data for boys and girls aged 9, 12 
and 15. More generally, the present investigation extends the factor analytic approach pioneered 
by Fleishman (1964), incorporating subsequent developments in the methodology of factor 
analysis. In particular, in "exploratory" factor analysis that was the "state of the art" at the time 
Fleishman did his research, the researcher has little control over the resulting factors. Whereas 
researchers may predict an a priori factor structure, they have no way of testing the ability of 
their a priori factor structure to fit their data. Instead, support for a priori predictions is based 
on the extent to which the factors that "come out" match those that were predicted or match those 
obtained in other factor analyses. This exploratory mode of factor analysis is being replaced by a 
"confirmatory" approach to factor analysis in which the researcher specifically tests the ability of 
an a priori factor structure to fit the data, thus providing a much stronger basis for testing theory. 
Furthermore, tests of factorial invariance allow researchers to constrain any one, any set, or all 
parameter estimates to be constant across two or more groups. The present investigation is a 
within-network study of construct validity in which I test an a priori structure of physical fitness. 
This is not to deny the critical importance of between-network studies that relate components of 
physical fitness to other constructs such as health, athletic accomplishments, and psychological 
well-being. Rather, consistent with Fleishman's underlying premise, it is argued that the relation 
between physical fitness and external constructs cannot be adequately understood unless there is 
reasonable resolution of within-construct issues about the structure of physical fitness. 
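
The confirmatory model and the invariance constraints described above can be written compactly. The notation below is conventional LISREL-style notation and is an assumption, not notation taken from this paper: for group g (one of the 6 age/gender groups), x is the vector of 25 indicators, Λ the 25 x 8 matrix of factor loadings, ξ the 8 factors, Φ the factor covariance matrix (a correlation matrix when the factors are standardized), and Θ the covariance matrix of the unique terms δ.

```latex
\begin{align}
x^{(g)} &= \Lambda^{(g)} \xi^{(g)} + \delta^{(g)}, \qquad g = 1, \dots, 6 \\
\Sigma^{(g)} &= \Lambda^{(g)} \Phi^{(g)} \Lambda^{(g)\top} + \Theta^{(g)} \\
\text{invariant factor loadings:} \quad \Lambda^{(1)} &= \Lambda^{(2)} = \dots = \Lambda^{(6)} \\
\text{invariant factor correlations:} \quad \Phi^{(1)} &= \Phi^{(2)} = \dots = \Phi^{(6)}
\end{align}
```

An a priori structure fixes which entries of Λ are zero before the data are examined; tests of invariance then compare the fit of a model with the equality constraints imposed against one in which the same parameters are estimated freely in each group.
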

Data for the present investigation are based on the Australian Health and Fitness 
Survey (AHAFS) that was conducted in 1985 with a nationally representative sample of 
Australian school children aged 7 to 15 (Pyke, 1987). The survey consisted of a comprehensive 
array of health and fitness measures including: (a) an extensive survey of sport/physical activities 
and health-related behaviours (e.g., smoking); (b) field exercises including measures of 
cardiovascular endurance (1.6 km run), dynamic strength (sit-ups, push-ups), explosive 
strength/power (standing long jump, 50 m dash), flexibility (sit-and-reach), and body composition 
(height, weight, and body girths); (c) technical measures (PWC170, dynamometer strength tests, 
skinfolds, blood pressure, and lung function); and (d) sophisticated laboratory measures (VO2max). 
In contrast to most youth fitness research, the AHAFS included technical and laboratory 
measures of fitness. In contrast to most laboratory research the AHAFS was based on a large, 
representative sample and included a wide array of non-laboratory measures. These data provide 
a strong basis for the present investigation because: considerable expertise was called upon in 
selecting tests for inclusion in the battery; the tests and protocols for their administration were 
extensively pilot-tested (including the production of an instructional video); and the nationally 
representative data provide a unique basis for evaluating the underlying structure of physical 
fitness and its generality across gender and age. (Surprisingly, given the comprehensiveness of 
these data, almost no analyses based on them have been published in major international journals and 
none have used these data to evaluate the structure of physical fitness.) 

Methods 

Sample and Procedures. 

A detailed description of the sampling design, test selection, testing protocols, and 
collection of the data is presented by Pyke (1987) and is summarized here only briefly. The 
target population consisted of all students aged 7 to 15 enrolled in Australian schools in 
September, 1985. A two-stage probability sample was used in which 52 primary and 52 
secondary schools were randomly chosen, and samples of 10 boys and 10 girls from each age/sex 
category were selected from each school. For present purposes only 6 groups are considered 
(boys and girls aged 9, 12, and 15) that were administered technical and laboratory measures as 
well as the field exercises that were completed by all participants. The ages were selected "to 
approximate the pre-pubertal, pubertal, and postpuberal growth stages" (Pyke, 1987, p. 10). 
Because of the added expense in obtaining VO2max measures, subsamples of students within
each of the 6 groups were selected. Students were selected who had previously completed other 
items from schools that were within a reasonable proximity of laboratory facilities where this 
testing was conducted. 

Insert Table 1 About Here 

All measures considered here (see Table 1) except for VO2max were collected by data
collection teams that went to participating schools. A total of 10 data collection teams, each 
consisting of 10 data collectors and a supervisor, were used. In the initial testing session, students 
completed height, weight, girth, sit and reach, long jump, push-up, sit-ups, skin fold, lung 
function, and dynamometer strength tests. These tests were conducted indoors and the order of 
presentation was varied. Following the completion of the indoors testing, the 50 M dash and then 
the 1.6 K run were conducted out of doors. Blood pressure and PWC170 were measured the 
following day. The VO2max testing was independently conducted by staff in exercise
physiological laboratories in tertiary institutions. In all tests, students were encouraged to do their 
best without any pressure being applied. Parental consent was obtained prior to collection of data 
and students could withdraw from the study at any time. 

Statistical Analyses.

An overview of the CFA approach. A detailed presentation of the conduct of 
confirmatory factor analysis (CFA) is beyond the scope of the present investigation, and is 
available elsewhere (e.g., Bollen, 1989; Byrne, 1989; Hayduk, 1987; Long, 1983; Joreskog & 
Sorbom, 1989; Marsh, 1987; Marsh & Hocevar, 1985; McDonald, 1985; Pedhazur & Schmelkin, 
1991). Briefly, the researcher posits an a priori structure and tests the ability of a solution based 
on this structure to fit the data. In a CFA study, the parameters typically consist of factor 
loadings (the relations between measured variables and the latent factors), factor variances and 
covariances (relations among the factors) and factor uniquenesses (a combination of specific and 
error variances like 1 minus the communality estimates in traditional, exploratory factor 
analyses). As in exploratory factor analyses, the factor loadings are of central importance in the 
"definition" of each factor. In the present investigation, for example, a total of 25 physical fitness 
indicators (see Table 1) are hypothesized to represent 9 physical fitness factors. A particularly
parsimonious model would be one in which each indicator had a non-zero factor loading on only 
the factor that it was hypothesized to measure. Thus, for example, factor loadings relating 
VO2max, the 1.6K run, and PWC170 to the first latent factor - Cardiovascular Endurance -
would be freely estimated, but the factor loadings relating these indicators to other factors would 
be fixed to be zero. 
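
The restricted loading pattern described above can be sketched with a small two-factor example that computes the covariance matrix such a model implies (loadings times factor covariances times transposed loadings, plus uniquenesses). Only the assignment of VO2max, the 1.6K run, and PWC170 to Cardiovascular Endurance is taken from the text; the choice of the two grip measures for a second factor and every numeric value are hypothetical and serve only as illustration.

```python
import numpy as np

indicators = ["VO2max", "1.6K run", "PWC170", "right grip", "left grip"]

# Hypothetical loading matrix (rows = indicators, columns = factors).
# Zeros are loadings fixed by the a priori model; non-zero entries stand in
# for loadings that would be freely estimated.
loadings = np.array([
    [0.8, 0.0],   # VO2max     -> Cardiovascular Endurance
    [0.7, 0.0],   # 1.6K run   -> Cardiovascular Endurance
    [0.3, 0.0],   # PWC170     -> Cardiovascular Endurance
    [0.0, 0.9],   # right grip -> Static Strength
    [0.0, 0.9],   # left grip  -> Static Strength
])

# Hypothetical factor covariance matrix (unit variances, correlation .5).
factor_cov = np.array([[1.0, 0.5],
                       [0.5, 1.0]])

# Hypothetical uniquenesses (specific plus error variance), uncorrelated here.
uniquenesses = np.diag([0.36, 0.51, 0.91, 0.19, 0.19])


def implied_covariance(lam, phi, theta):
    """Model-implied covariance matrix: lam @ phi @ lam.T + theta."""
    return lam @ phi @ lam.T + theta


sigma = implied_covariance(loadings, factor_cov, uniquenesses)
```

In this sketch the covariance between indicators of different factors arises only through the factor correlation, which is what the fixed zero loadings express.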

In addition to factor loadings, there are also uniquenesses associated with each of the 
indicators that reflect a combination of error and specific variance. These uniqueness terms are 
often hypothesized to be uncorrelated, but it is also possible to fit correlated uniquenesses to
reflect relations between individual indicators that cannot be explained in terms of the a priori 
factors. In the present application, for example, correlated uniquenesses were posited for the 
following pairs of indicators: the two running measures (1.6 K run and 50 M dash); the two static 
strength indicators involving the shoulder (shoulder push and shoulder pull); the two static 
strength measures involving hand grip strength (right grip strength and left grip strength); and the 
two skinfold measures involving the arm (biceps and triceps). These correlated uniquenesses are 
apparently consistent with the general "running" factor and fitness associated with specific parts 
of the body proposed by Fleishman (1964). 

In the CFA approach, diagnostics such as the modification index in LISREL (Joreskog & 
Sorbom, 1988) indicate whether freely estimating a parameter that has been constrained (e.g., a 
factor loading that has been fixed to be zero) will substantially improve the fit of the model. 
Using a step-wise process, this information can be used to improve the model. Because this 
process capitalizes on chance (in much the same way as step-wise approaches to multiple 
regression), it should be used cautiously. Thus, Joreskog and Sorbom recommend that 
researchers should only free parameters that can be justified from a substantive point of view. If 
many parameters are freed, then it is important to compare critical parameter estimates in 
solutions based on the original a priori model and the final a posteriori model. Ultimately, a 
posteriori models that differ substantially from a priori models should be cross-validated with 
new data. 

Tests of factorial invariance. When parallel data exists for more than one group, CFA 
provides a particularly powerful test of the equivalence of solutions across the multiple groups. 
The researcher is able to fit the data subject to the constraint that any one, any set, or all 
parameters are equal in the multiple groups. The minimal condition for "factorial invariance" is 
the equivalence of the factor loadings in multiple groups, although Byrne, Muthen and Shavelson 
(1989) argued for the usefulness of "partial invariance" in which the factor loading for at least 
one indicator per factor is invariant across groups. It is also of interest to test for the invariance 
of factor correlations (see Marsh & Hocevar, 1985) that reflect relations among the different 
factors. Of less relevance is the invariance of factor variances and the uniquenesses associated 
with individual indicators. Particularly when the focus of the CFA is to test the invariance of 
solutions across multiple groups, it is critical that analyses are conducted with covariance 
matrices in which variables are scaled along a common metric across the multiple groups (e.g., 
the original score values or the same transformation across all groups) and not correlation 
matrices in which each group is scaled in relation to its own mean and standard deviation (i.e., a 
different transformation for each group; for further discussion see Joreskog & Sorbom, 1988). 
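
The difference between a common metric and a group-specific metric can be illustrated with a short sketch. The two groups, their means, and their standard deviations below are simulated and hypothetical; the point is only that rescaling each group by its own standard deviation removes the between-group differences in variance that an invariance test needs to see.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw scores on one indicator for two groups whose spread differs.
group_a = rng.normal(loc=10.0, scale=1.0, size=500)
group_b = rng.normal(loc=14.0, scale=3.0, size=500)


def standardize(scores, mean, sd):
    """Linear transformation to z-scores using the supplied mean and SD."""
    return (scores - mean) / sd


# Common metric: one transformation, based on the pooled scores, for both groups.
pooled = np.concatenate([group_a, group_b])
common_a = standardize(group_a, pooled.mean(), pooled.std())
common_b = standardize(group_b, pooled.mean(), pooled.std())

# Group-specific metric: each group is rescaled by its own mean and SD,
# which is what analysing separate correlation matrices amounts to.
own_a = standardize(group_a, group_a.mean(), group_a.std())
own_b = standardize(group_b, group_b.mean(), group_b.std())

# Ratio of group variances under each scaling.
raw_ratio = group_b.var() / group_a.var()
ratio_common = common_b.var() / common_a.var()  # equals the raw variance ratio
ratio_own = own_b.var() / own_a.var()           # equals 1 up to rounding
```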

Goodness of fit. A critical issue in the application of CFA is how to determine whether a 
solution based on an a priori model adequately fits the data or how to compare the relative fit of 
competing models. The evaluation of goodness of fit is not fully resolved, but a general approach 
is to: (a) determine that the iterative procedure converges to a proper solution that is well-defined 
(e.g., the solution has no parameter estimates that have impossible values such as negative 
variances or correlations greater than 1.0); (b) establish that parameter estimates are 
substantively reasonable in relation to the a priori model and common sense (e.g., if 4 of 5 
indicators of static strength have positive factor loadings but the factor loading for the remaining 
indicator is negative, then the solution does not make sense and should be interpreted with 
extreme caution); and (c) evaluate the χ² test statistic and various fit indices in relation to rules
of thumb and values from competing models. Whereas there is an emphasis on goodness of fit
indices in CFA studies, it should be noted that the first two criteria are logical prerequisites to
evaluating goodness of fit indices. If the empirical solution is improper, then the parameter
estimates and fit indices should only be evaluated with extreme caution, if at all. If the
parameter estimates are not consistent with the a priori model and make no sense, then goodness 
of fit indices may be irrelevant. 

In an evaluation of goodness-of-fit indices typically used in CFA, Marsh, Balla and
McDonald (1988; also see McDonald & Marsh, 1990) noted that the Tucker Lewis Index (TLI) 
was the only widely used index that was relatively independent of sample size and relatively 
unaffected by the inclusion of additional parameter estimates that were known to have zero 
values in simulated data, and so it is emphasized here. McDonald and Marsh noted that the 
widely used Bentler-Bonett index is biased, and presented an alternative to it based on 
noncentrality - the relative noncentrality index (RNI) - that is not biased (also see Bentler, 
1990). McDonald and Marsh recommended that parsimony should be considered in evaluating
goodness of fit. Following Mulaik, et al. (1989), they used the parsimony ratio defined as the 
ratio of the degrees of freedom in the model to be tested and a suitably defined null model (here 
taken to be a model which produces a diagonal reproduced covariance matrix in which all 
measured variables are assumed to be uncorrelated and the number of estimated parameters is equal to the
number of measured variables). Thus the parsimony ratio reflects the complexity/parsimony of
the model and not the ability of the model to fit the data. All other things being equal, more 
parsimonious models are preferable to more complex models. However, McDonald and Marsh 
(also see Marsh & Balla, 1992) questioned the apparently arbitrary operationalization of
parsimony in parsimony indices that are defined as the product of the parsimony ratio and some 
other index of fit, but argued that if parsimony indices are to be used then they should be based 
on an unbiased index such as the RNI. For present purposes I emphasize the TLI, but also present 
the χ², RNI, parsimony ratio, and the parsimony index based on the RNI (PRNI) and note that
most other indices of fit can be derived from the information that is presented. 
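
As a minimal sketch of how these indices relate to one another, the functions below compute the TLI, RNI, parsimony ratio, and PRNI from the χ² and degrees of freedom of a target model and a null model. The formulas are the usual textbook definitions and are assumed here rather than taken from the present text; the χ² values and degrees of freedom in the example are hypothetical.

```python
def tli(chi2_target, df_target, chi2_null, df_null):
    """Tucker-Lewis index based on chi-square/df ratios."""
    null_ratio = chi2_null / df_null
    target_ratio = chi2_target / df_target
    return (null_ratio - target_ratio) / (null_ratio - 1.0)


def rni(chi2_target, df_target, chi2_null, df_null):
    """Relative noncentrality index based on (chi-square - df)."""
    null_noncentrality = chi2_null - df_null
    target_noncentrality = chi2_target - df_target
    return (null_noncentrality - target_noncentrality) / null_noncentrality


def parsimony_ratio(df_target, df_null):
    """Ratio of target-model to null-model degrees of freedom."""
    return df_target / df_null


def prni(chi2_target, df_target, chi2_null, df_null):
    """Parsimony index: parsimony ratio multiplied by the RNI."""
    return parsimony_ratio(df_target, df_null) * rni(
        chi2_target, df_target, chi2_null, df_null
    )


# Hypothetical null model and two nested target models.
null_model = (20000.0, 300)
less_constrained = (500.0, 250)
more_constrained = (540.0, 280)  # 30 added constraints, small rise in chi-square

tli_less = tli(*less_constrained, *null_model)
tli_more = tli(*more_constrained, *null_model)
rni_less = rni(*less_constrained, *null_model)
rni_more = rni(*more_constrained, *null_model)
prni_less = prni(*less_constrained, *null_model)
prni_more = prni(*more_constrained, *null_model)
```

With these hypothetical values the more constrained model has a slightly lower RNI but a slightly higher TLI and a clearly higher PRNI, because the last two reward the additional degrees of freedom.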

The present application. The data for the present investigation are covariance matrices 
(or, equivalently, correlation matrices supplemented by standard deviations; see Appendix) for 
each of the 6 groups (boys and girls aged 9, 12, and 15). All statistical analyses were conducted 
with the commercially available "mainframe" version of LISREL 7 (Joreskog & Sorbom, 1988). 
In order to facilitate interpretations each indicator was standardized in relation to the "total 
group" mean and standard deviation of scores across all six groups and the two running 
indicators (1.6K run and 50 M dash) were reverse scored so that higher values reflect better 
levels of fitness. (Note that all subgroups were standardized in relation to the same mean and SD 
rather than standardizing each group in relation to its own mean and standard deviation.) 
Because there was relatively little missing data (except for VO2max for which only a subsample
of students were tested), students with missing values for more than 2 indicators (other than
VO2max) were excluded, but all students with VO2max scores were automatically retained. The
total N across all 6 groups was 2,817 (an average of 469.5 per group), but only 277 students (an
average of 46 per group) had VO2max scores. Furthermore, because of the two-stage clustered
sampling design, standard errors based on the assumption of simple random sampling 
substantially underestimate sampling variability in summary statistics and distort tests of statistical
significance (see NCES, 1986, for related discussion of the High School and Beyond data that
also used a two-stage sampling scheme). To compensate for this bias, the effective sample size 
was estimated to be 1800 (an average of 300 students per group). It is important to note that this 
correction has no effect on any of the parameter estimates; it only affects the degrees of freedom
used in tests of statistical significance. 
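
The text does not say how the effective sample size was estimated. The fragment below only works through the arithmetic implied by the reported figures, using the common definition of a design effect as the ratio of actual to effective sample size; the variable names are illustrative.

```python
actual_n = 2817      # total N across the six groups, as reported
effective_n = 1800   # effective sample size, as reported
n_groups = 6

# Design effect implied by the two reported sample sizes.
design_effect = actual_n / effective_n

# Effective sample size per group (the text gives 300).
per_group_effective = effective_n / n_groups

# Standard errors are proportional to 1 / sqrt(n), so this is the ratio of a
# standard error computed with the effective N to one computed with the actual N.
se_ratio = design_effect ** 0.5
```

Under this definition the implied design effect is roughly 1.57, which corresponds to standard errors about 25% larger when the effective rather than the actual sample size is used.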

Results 

The Initial Model 

In the first stage of the analyses, I tested the ability of the a priori model (see Table 1) to 
fit the data separately for each of 6 groups (boys and girls aged 9, 12 and 15). Several features of 
the present investigation, however, require special attention. Ideally, for purposes of CFA, there 
should be three or more good indicators of each factor. In the present application, however, 4 
hypothesized factors have only 2 indicators and Flexibility has only 1 indicator. Single-indicator 
factors can be considered, but they provide a weak basis for testing construct validity of a factor 
and for the appropriate corrections for error that are possible when there are multiple indicators.
Two-indicator factors, although globally identified in most applications, may result in unstable or 
improper solutions. One expedient approach to such problems is to require the two indicators to 
load equally on each factor, thus reducing the number of estimated parameters and typically 
producing a more stable solution that is less likely to be improper. 

In the present investigation, nine-factor solutions did not result in proper solutions when 
tested separately for each group. In an attempt to resolve this problem, factor loadings for all 
two-indicator factors were required to be equal. (Also, because PWC170 had consistently small 
loadings on the Cardiovascular Endurance factor, the remaining two indicators were also 
required to load equally.) Even these equality constraints, however, did not result in proper
solutions for all groups. One problem was that the correlation between Explosive Strength and 
Dynamic Strength was consistently close to 1.0 and sometimes exceeded 1.0 (which is, of 
course, an improper solution). When these two factors were combined to form a single factor -
subsequently called Explosive/Dynamic Strength - the 8-factor model resulted in proper
solutions for all 6 groups. (The implications of this initial decision - in terms of the application of
CFA and the structure of physical fitness - are evaluated subsequently in greater detail.)

Insert Tables 2 and 3 About Here 

In further refinements of this model, several other improvements were made (based on
LISREL's modification indices). First, as indicated earlier, correlated uniquenesses between 
several pairs of indicators were added. Also, although the PWC170 indicator had only a small
loading on the Cardiovascular Endurance factor, it had a substantial loading on the Static
Strength factor. Finally, the number of push-ups loaded negatively on the Body Girth factor
(indicating that individuals with larger bodies are less proficient at push-ups). Parameter
estimates based on this model are presented in Table 2. This model resulted in fully proper 
solutions for all 6 groups in that the iterative process converged, no parameter estimates for any 
of the groups fell outside their permissible values, and matrices of parameter estimates were 
positive definite. The factors are well-defined in that - with the exception of the PWC170 - all 
indicators load substantially on the factor that they were hypothesized to represent. Furthermore, 
the goodness of fit indices (Table 3) are reasonable and consistent for each of the 6 groups 
considered separately and for the total across the six groups (e.g., TLI and RNI indices all 
approximate the .9 value that is typically interpreted to reflect an adequate goodness of fit). An 
inspection of the results for the six groups suggested that at least the factor loadings are 
reasonably consistent across groups, but the CFA approach offers much stronger tests of the 
equivalence of solutions across groups. 

The Invariance of Solutions Across Gender and Age.

A substantively important issue in the present investigation is to evaluate the extent to 
which the physical fitness factor structure is the same for boys and girls of different ages. With 
the CFA approach it is possible to constrain any one parameter, any set of parameters, or all 
parameters to be the same across any two groups, any set of groups, or all groups. To the extent 
that a more parsimonious solution with such invariance constraints is able to fit the data, then 
there is support for the invariance constraints. If, however, the imposition of such invariance 
constraints results in a substantially poorer fit, then there is evidence against the invariance 
constraints. 

In the present investigation I evaluate factorial invariance in relation to the six groups 
(boys and girls of three ages) and four sets of parameters (factor loadings, factor variances, factor 
correlations, and uniquenesses). I begin with tests of the equality of factor loadings across all 
groups, followed by tests of factor correlations, and then consider tests of uniquenesses and 
factor variances that are substantively less important. As expected, the results vary logically 
depending on the goodness of fit index. The RNI index is monotonic with model complexity, in 
that requiring any parameters to be equal in two groups cannot result in an improved index and 
will result in a poorer index unless the two parameters happen to be exactly equal when no 
constraints are imposed. For the RNI, there is support for an equality constraint if the decrement 
in fit resulting from its introduction is small. The TLI typically behaves similarly to the RNI, but 
contains a penalty for model complexity such that it is technically possible for the introduction of 
invariance constraints to result in an improved TLI. The PRNI (the parsimony index based on the 
RNI) severely penalizes model complexity and leads to the selection of more parsimonious 
models than the other indices. 

For all indices there is strong support for the invariance of factor loadings and factor 
correlations. In fact, the TLI, which is emphasized here, is slightly better for the model imposing the
complete invariance of factor loadings and factor correlations across all six groups (.890; Table 
3) than the corresponding model with no invariance constraints (.888; Table 3). Because the 
model with factor loading and factor correlation invariance is so much more parsimonious than 
the model with no invariance constraints (see parsimony ratio in Table 3), the PRNI parsimony 
index is substantially better for the model with invariance constraints (.842 vs. .742). Whereas 
the RNI always favors the model with no invariance constraints, the difference is small (.897 vs 
.908) in relation to the substantial difference in parsimony. Thus, the results of these invariance 
tests provide good support for the invariance of factor loadings and factor correlations across
gender and age. 

In contrast, there is not such good support for the invariance of factor variances or the 
uniquenesses (Table 3). These results are consistent with the observation that the variances 
associated with each measured variable differ systematically with gender and particularly age
(see Appendix), and these differences must be reflected in larger factor variances, larger 
uniquenesses, or both. As noted earlier, the invariance of factor variances and uniquenesses is 
substantively less important than the invariance of factor correlations and particularly the factor 
loadings. Furthermore, even though there is not support for the invariance of factor variances
and uniquenesses across all 6 groups, it is possible that invariance constraints are supported 
within more specific sets of groups. In particular, it is relevant to test the invariance of solutions 
over age separately for boys and girls, and to test the invariance of solutions over gender 
separately for students aged 9, 12, and 15. 

Invariance Across Age For Each Gender and Across Gender For Each Age.

Consistent with results based on tests across all 6 groups, there is good support for the 
invariance of factor loadings and factor correlations within each of the more specific tests 
summarized in Table 4. For each age considered separately, there is support for the invariance of 
these parameter estimates across scores for boys and girls. Similarly, for boys and girls 
considered separately, there is good support for the invariance of these parameters over age. 
These results further substantiate interpretations based on all 6 groups.

Insert Tables 4 and 5 About Here 

Also consistent with earlier analyses of the 6 groups, there is poorer support for the 
invariance of factor variances and uniquenesses. This lack of invariance, however, varies 
consistently depending on the comparison. The lack of invariance is evident in tests of invariance 
over age considered separately for girls and for boys (Table 4). In contrast, there is better support 
for the invariance of factor variances and, perhaps, uniquenesses over gender for the separate 
analyses of each age group. In particular, there is good support for the complete invariance of all 
parameter estimates across gender for 12 year olds (TLIs of .896 vs. .889; Table 4) and for the 
invariance of at least the factor variances for 9 year olds (TLIs of .878 vs. .878; Table 4). Support 
for the invariance of these parameter estimates across gender is weaker for 15 year olds. These 
results suggest that the lack of invariance in factor variances and factor uniquenesses is due 
primarily to comparisons across different ages, although there are also gender differences in 
these parameter estimates for 15 year olds. 

Finally, combining these results, I examined various combinations of invariance 
constraints to find the most parsimonious model (see Table 5) that is best able to fit the data. 
Beginning with factor variances, the most parsimonious model able to fit the data constrains 
factor variances to be: equal across 9 year old boys and girls; equal across 12 year old boys and 
girls and 15 year old girls; freely estimated for 15 year-old boys. For uniquenesses, only the 
invariance across 12 year old boys and girls is supported. This "final" model is remarkably 
parsimonious as indicated by the large parsimony ratio (.968). It requires only 208 parameters to
be estimated across the six groups which is about half of the 408 parameter estimates required by 
the original model with no invariance constraints. Despite this substantial reduction in the 
number of estimated parameters, the goodness of fit evaluated by the TLI is marginally better for 
the more constrained model (TLIs of .890 vs .888). The RNI, which is monotonic with model
complexity, is necessarily better for the original unconstrained model (.908 vs. .892) but the
difference is small in relation to the change in parsimony. The parsimony index more severely 
penalizes model complexity, and so the advantage of the final constrained model over the 
original unconstrained model is even more extreme according to this index (.864 vs. .742). In 
fact, the parsimony index leads to the selection of even more highly constrained models than the 
"final" model, although the differences are small (see Table 4). 
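
The .968 value reported for the "final" model can be checked with simple parameter counting. The sketch below assumes that only covariance structures are fitted (no means), so that each group contributes 25 × 26 / 2 distinct variances and covariances, and that the null model estimates only the 25 variances in each group. Both assumptions are mine and are not stated in this form in the text.

```python
n_indicators = 25
n_groups = 6

# Distinct variances and covariances available in each group's matrix.
moments_per_group = n_indicators * (n_indicators + 1) // 2
total_moments = n_groups * moments_per_group

# Assumed null model: only the variances are estimated in each group.
null_df = total_moments - n_groups * n_indicators


def model_df(n_estimated_parameters):
    """Degrees of freedom left after estimating the given number of parameters."""
    return total_moments - n_estimated_parameters


# The final model is reported to require 208 estimated parameters.
final_df = model_df(208)
final_parsimony_ratio = final_df / null_df
```

Under these assumptions the final model has 1,742 of the 1,800 null-model degrees of freedom, a ratio that rounds to the reported .968.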

Substantive Evaluation of Parameter Estimates.

The final constrained model (Table 2) differs from the original a priori model (Table 1) in 
two major respects. First, the correlation between the Dynamic and Explosive Strength factors 
was so large that the separation of the two factors could not be substantiated. Second, the 
PWC170 was posited to reflect the Cardiovascular factor, but it loads more highly on Static 
Strength. Also, whereas pushups also loads on Girth this is not unexpected (see related finding 
for weight reported by Fleishman, 1964) and pushups loads more highly on the 
Dynamic/Explosive Strength factor that it was intended to measure. The correlated uniquenesses 
are also theoretically relevant. The correlated uniqueness relating the two running measures -
1.6K run and 50M dash - is consistent with the general running measure proposed by
Fleishman. The correlated uniquenesses relating the two shoulder strength tests, the two hand grip
tests, and the two arm skinfold measures may also be consistent with Fleishman's suggestion that
fitness is specific to particular parts of the body.

Although no specific pattern of correlations among the different facets was posited a 
priori, these results are a potentially important contribution of the present investigation -- 
particularly since there was such good support for the invariance of the correlations over boys 
and girls and across the three ages. The Cardiovascular factor is substantially correlated with 
Static Strength (.578), but not with Dynamic/Explosive Strength (.067) - even though 
Dynamic/Explosive and Static Strength are substantially correlated with each other (.549).

Not surprisingly, the correlation between Girth and Skinfold (.871) is the largest of all the 
factor correlations. These two factors could not, however, be combined into a single factor 
without substantially hurting goodness of fit. Consistent with this finding is the observation that 
Dynamic/Explosive Strength is more negatively correlated with Skinfold (-.443) than with Girth
(-.175), whereas Static Strength and Lung Function are more positively correlated with Girth
(.572 and .467) than with Skinfold (.210 and .168). This contrasting pattern of relations involving 
the Girth and Skinfold factors is theoretically reasonable and demonstrates why it may be 
inappropriate to combine the two factors. Both the Girth and Skinfold factors, however, are 
similarly related to the Cardiovascular (-.422 and -.492) and Blood Pressure (.346 and .240) 
factors. 

Flexibility is not substantially correlated with any of the other fitness factors, although it 
has small positive correlations with the two strength factors (rs of .272 and .229). Similarly, 
Blood Pressure is not substantially correlated with the other fitness factors, although it is 
positively related to Girth, Skinfold, Lung Function, and Static Strength factors, but negatively 
correlated with the Cardiovascular factor. 

The Lung Function factor is substantially correlated with Girth (.467) and Static Strength
(.698), less substantially correlated with other fitness factors, and nearly uncorrelated with the 
Cardiovascular factor (-.029). Whereas the relation between Lung Function and Static Strength 
reflects in part the relation of both these factors to Girth, the size of this relation seems 
surprisingly large. 

Further Tests of the "Final" Model.

It is also useful to provide further tests of the "final" model and to demonstrate additional 
features of the CFA approach. Whereas constraints on the a priori predictions were required to 
achieve a model that resulted in proper solutions when tests were conducted separately for each 
group, considerable robustness to the solution is added by the introduction of invariance 
constraints across the different groups. Thus, for example, it is not necessary to impose equality 
constraints on the two-indicator factors in order to achieve a proper solution, although relaxing 
this constraint did not substantively influence the overall pattern of parameter estimates and 
resulted in the same TLI = .890 as the "final" model.

Insert Table 6 About Here 

Of greater substantive interest was an attempt to fit the original 9-factor model with the 
added stability of the invariance constraints in the final model. The resulting nine-factor model 
fit the data marginally better than the final model (TLIs of .891 vs. .890). Whereas the very large 
correlation between the Dynamic Strength and Explosive Strength factors (.925; see Table 6) was 
less than 1.0, the solution was technically improper in that the factor correlation matrix was not 
positive definite (see Joreskog & Sorbom, 1988). Also, the pattern of relations between these two 
Strength factors and the other fitness factors is very similar. Furthermore, when this model was 
fit separately to each of the 6 groups (i.e., there were no between-group invariance constraints), 
every solution was improper and the estimated correlation between Dynamic Strength and 
Explosive Strength was greater than 1.0 in some of the solutions. These observations - 
particularly the finding that the 9-factor solution was still technically improper - apparently 
provide support for the initial decision to combine the two factors for purposes of the present
investigation. Substantively, the similarity in the pattern of correlations between these two 
factors and the remaining factors also supports this decision. More generally, other parameter 
estimates based on the 8-factor (Table 2) and 9-factor (Table 6) solutions are very similar, 
suggesting that combining or not combining these two factors is not a critical concern in the 
present investigation. It may be, however, that the two factors could be better differentiated in
other studies that have more and, perhaps, better indicators of these two strength factors. 
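
A factor correlation matrix can fail to be positive definite even when every individual correlation is below 1.0, which is the situation described above for the nine-factor solution. The sketch below checks positive definiteness through the smallest eigenvalue. Apart from the .925 correlation, the matrices and their values are hypothetical three-factor examples and are not estimates from Table 6.

```python
import numpy as np


def is_positive_definite(matrix, tol=1e-10):
    """True if the smallest eigenvalue of a symmetric matrix exceeds tol."""
    eigenvalues = np.linalg.eigvalsh(matrix)
    return bool(eigenvalues.min() > tol)


# Two factors correlate .925 with each other but relate very differently
# (.90 vs .50) to a third factor: no single entry is out of range, yet the
# three correlations cannot all hold at once.
inconsistent = np.array([
    [1.000, 0.925, 0.900],
    [0.925, 1.000, 0.500],
    [0.900, 0.500, 1.000],
])

# Same .925 correlation, but with similar relations to the third factor.
consistent = np.array([
    [1.000, 0.925, 0.900],
    [0.925, 1.000, 0.850],
    [0.900, 0.850, 1.000],
])

inconsistent_ok = is_positive_definite(inconsistent)  # False
consistent_ok = is_positive_definite(consistent)      # True
```

The check therefore depends on the whole pattern of correlations, not only on whether any single correlation exceeds 1.0.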

Summary and Discussion

The focus of the present investigation is both substantive and methodological. 
Methodologically, the study demonstrates the CFA approach to testing the structure of physical 
fitness. This acknowledges and extends the important factor analytic tradition in physical fitness 
testing established by Fleishman (1964) that has apparently been neglected in the last decade. 
Substantively, the study is important in that it supports - with some exceptions - the a priori
structure of physical fitness and the invariance of this structure across gender and age. These 
findings are important because the present investigation examined a more diverse sample of 
physical fitness indicators than is typically considered, and because the size and 
representativeness of the sample are better than most studies that include technical and laboratory 
measures of fitness. 

Fundamental premises underlying the present investigation are that physical fitness is a 
hypothetical construct that must be validated within a construct validity approach, and that 
physical fitness is a multidimensional construct that cannot be adequately understood if this 
multidimensionality is ignored. Factor analysis, particularly the CFA approach demonstrated
here, is an important statistical tool for evaluating both these premises. The final model (Table 
2), although it differs somewhat from the originally posited model (Table 1), provides clear 
support for the multidimensionality of physical fitness. Support for this multidimensional 
structure is particularly strong in that at least the factor loadings and correlations among the 
factors are reasonably invariant for boys and girls aged 9, 12 and 15. This provides much 
stronger support for the empirical factor structure than would a test based on a single group. 

This invariance of the factor structure also has important practical implications for 
physical fitness testing. As noted earlier, inadequate attention has been given to the question of 
whether a given indicator measures the same component of physical fitness with equal validity 
for boys and for girls, and across different ages. If the underlying meaning of a particular 
indicator differs depending on gender or age, then the task of interpreting each indicator and 
assessing physical fitness would be much more difficult. The invariance of the factor loadings 
and the factor correlations bear on two different aspects of this issue. The factor loadings reflect 
the relation between a particular indicator and the underlying latent constructs that it is posited to 
represent - the validity of the indicator. For example, the fact that VO2max and the 1.6K run 
(but not PWC170) load substantially on the Cardiovascular factor supports their construct validity 
as indicators of this factor. Support for the invariance of these factor loadings implies that these 
indicators are equally valid for boys and girls of different ages. The invariance of the factor 
correlations indicates that relations among the different factors are the same for boys and girls of 
different ages. Thus, for example, Skinfold is positively related to some components of physical 
fitness and negatively related to others, but the size and direction of these correlations are similar 
for boys and girls of different ages. 

Despite some important strengths of the present investigation and the CFA approach, 
there are also important limitations. Critical limitations inherent in the CFA approach and, to 
some extent, the present investigation, are the number of indicators needed to infer each factor 
and the sample sizes. Obviously, a factor cannot be identified if there are no indicators of the 
factor. Thus, for example, Fleishman (1964) reported six dimensions related to speed, flexibility, 
balance, and coordination. In the present investigation, only one relevant measure of Flexibility 
(the sit-and-reach test) was included, and so this aspect of Fleishman's research could not be 
evaluated with the data considered here. More generally, in the CFA approach there should be at 
least three good indicators of each hypothesized factor. Particularly in relation to the original a 
priori model that posited nine factors (Table 1), this recommendation was satisfied for only 
3 of the 9 hypothesized factors. This limitation of the existing data apparently contributed to the 
improper solutions based on the original a priori model and, perhaps, to the failure to distinguish 
between the Explosive Strength and Dynamic Strength factors. As demonstrated here, expedient 
solutions to this problem include imposing invariance constraints within or between groups, and 
fitting more parsimonious models with fewer latent factors and estimated parameters. A second 
limitation that was not such a serious problem in the present investigation is the large sample size 
required of CFA studies. There are no absolute guidelines about the minimum sample size that is 
required and the requirements may be somewhat idiosyncratic to particular applications. 
Nevertheless, sample sizes of at least 200 subjects (per group) are typically recommended and 
considerably more subjects may be required for models involving many measured variables, 
latent factors, and estimated parameters (e.g., Tanaka, 1987). Problems associated with small 
sample sizes can, perhaps, also be offset by fitting more parsimonious models or by imposing 
invariance constraints. Ultimately, however, these sample size requirements mean that the CFA 
approach may not be appropriate to many small-scale laboratory studies of physical fitness. 

It is also relevant to consider the implications of the present investigation for notions of 
"overall" physical fitness. The clear support for the multiple dimensions of physical fitness and 
the small correlations among many of the physical fitness factors imply that it is inappropriate to 
simply average the different factors - or the indicators used to infer the factors - to obtain an 
overall index of fitness. Considerable information in the specific factors will be 
lost in the formation of a single total score. A much more useful summary of physical fitness is a 
profile of scores in which each score is compared to standards established in relation to 
appropriate norm groups, criterion references, or multiple sets of scores for the same individual 
collected over an extended period of time (e.g., achieving a "personal best"). In relation to a 
particular criterion, it may be appropriate to provide a single summary score that represents an 
optimally weighted combination of the multiple dimensions in which the weights are established 
on the basis of theory, empirical research, and, perhaps, expert opinion. Even here, however, the 
weight assigned to each dimension is likely to vary considerably depending on the particular 
criterion (e.g., performance in different athletic tasks, physical health, or psychological well-being) 
and, perhaps, the manner in which the weights are established. Implicit in this weighted 
average approach is the recognition that not all dimensions of physical fitness may be relevant to 
all situations (e.g., the appropriate weight for a particular dimension in a given situation may be 
zero). Thus, consistent with the multidimensional perspective of physical fitness emphasized 
here (also see Fleishman, 1964), the most generally useful summary of physical fitness 
dimensions is a multidimensional profile of scores rather than a single indicator (e.g., VO2max) 
or a total that is based on the implicit assumption that the importance of all dimensions is the 
same for all intended purposes of the physical fitness test. 
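
The contrast between an unweighted total and criterion-specific weighted summaries can be sketched as follows. The dimension names, the student's scores, the two criteria, and all weights are hypothetical values chosen only for illustration; they are not estimates from the survey data.

```python
# Hypothetical standardized scores (z-scores) for one student on four
# fitness dimensions. These numbers are illustrative, not survey results.
profile = {
    "cardiovascular": 1.2,
    "static_strength": -0.5,
    "flexibility": 0.3,
    "skinfold": -0.8,
}

# Hypothetical criterion-specific weights. A weight of zero means the
# dimension is treated as irrelevant to that criterion.
weights_by_criterion = {
    "distance_running": {"cardiovascular": 0.7, "static_strength": 0.0,
                         "flexibility": 0.1, "skinfold": -0.2},
    "lifting_task": {"cardiovascular": 0.0, "static_strength": 0.8,
                     "flexibility": 0.0, "skinfold": 0.0},
}


def weighted_summary(scores, weights):
    """Weighted combination of dimension scores for one criterion."""
    return sum(weights[name] * score for name, score in scores.items())


def unweighted_total(scores):
    """Simple average that implicitly gives every dimension the same weight."""
    return sum(scores.values()) / len(scores)


summaries = {criterion: weighted_summary(profile, weights)
             for criterion, weights in weights_by_criterion.items()}

if __name__ == "__main__":
    print("profile:", profile)
    print("unweighted total:", round(unweighted_total(profile), 3))
    for criterion, value in summaries.items():
        print(criterion, round(value, 3))
```

With these made-up numbers the same profile yields 1.03 for one criterion and -0.40 for the other, while the unweighted average (0.05) conceals both.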
Table 1 

A Priori Fitness Factors and a Description of Indicators 

Cardiovascular Endurance 

Maximal Oxygen Uptake (VO2max). A continuous direct measure of maximal oxygen uptake 
(in mL/kg/min) was taken using a treadmill. The initial treadmill speed varied according 
to age and sex, and was increased 2% every two minutes until criteria of steady state were 
achieved. All results were compiled at the central survey office. 
1.6 K Run. Measured on an oval track of 200 m or 400 m. 

Physical Work Capacity (PWC170). Measured in Kmg/kg x min using a Monark bicycle as a 
continuous test with 3 workloads of 3 minutes each, each being at a higher workload than the 
previous workload. Direct measures of heart rate were taken with a stethoscope and 
stopwatch, and PWC170 scores were generated by computer. 
Explosive Strength 

50 M dash. Measured (in sec), after a warm-up, in a single sprint over a flat, cross-wind track. 
Standing Long Jump. The longest of two jumps (in cms) done from a standing take-off. 
Dynamic Strength 

Sit-ups. These were done with knees bent to 140° in a cadence of 20/minute up to a 
maximum of 100. 

Push-ups. The number done in 30 seconds, using a 46 cm chair with student's feet behind a 
line set at their elbow height from the front of the chair. 
Static Strength (in Kg.) 

Right Grip Strength. The best of two trials with the dynamometer resting on the opposite 
shoulder. 

Left Grip Strength. The best of two trials with the dynamometer resting on the opposite 
shoulder. 

Shoulder Push Strength. The best of two trials with the dynamometer at the level of nipples. 

Shoulder Pull Strength. The best of two trials with the dynamometer at the level of nipples. 

Leg Strength. The best of two trials with back against the wall and knees bent at 115°. 
Flexibility/Joint Mobility 

Sit and reach. The student was seated and stretched as far as possible, holding for 3 seconds. Score 
is the cms. beyond their toes (or negative scores if they do not reach their toes). 
Blood Pressure (mm Hg measured after 5 min. rest with a mercury sphygmomanometer) 

Systolic Blood Pressure. The Korotkoff sound I. 

Diastolic Blood Pressure. The mean of Korotkoff sounds IV and V. 
Lung Function (in L using a Vitalograph adjusted to student's height). 

FEV1. The 1 second Forced Expiratory Volume 

FVC. Forced Vital Capacity 
Body Girth (cm. assessed using a constant tension tape) 

Mid-arm Girth 

Waist Girth 

Hip Girth 

Skinfolds (in mm measured with a Holtain calliper). 
Biceps Skinfold 
Triceps Skinfold 
Subscapular Skinfold 
Suprailiac Skinfold 
Midabdominal Skinfold 



Note. The a priori categories are based on the design of the battery (Pyke, 1987) and previous 
research - particularly Fleishman (1964). 
Table 2 

Eight Factor Solution For 12 Year Old Boys Standardized To a Common Metric 

Variables    Factor Loadings    Corr a 

[Only fragments of the factor loadings and factor correlations survive in the extracted text: .100; 1; Lung -.029; .572, .005, .346, .467.] 

Factor Variances    .882   1.060   .950   1.019   .983   .988   1.119   1.159 

[Trailing fragments: .196, .082, .039.] 

Note: See Table 1 for a description of the measured variables. This solution based on data for 12 year old boys was 
standardized in relation to a common metric across all 6 groups to facilitate interpretations (see Joreskog & Sorbom, 
1989). The factor loadings and factor correlations that are the major focus of the present investigation were invariant 
across all 6 groups, but factor variances and uniquenesses varied depending on the group (see discussion of results). 
Thus factor loadings and factor correlations presented here are the same for all 6 groups. The average factor variance 
across the 6 groups is 1.0 but differs from group to group. 

a Correlated uniquenesses suggest that the relationship between a pair of measured variables is greater than can be 
explained by the posited structure. The four correlated uniquenesses posited here represent relations between the two 
running measures (1.6 K run and 50 M dash); the two static strength indicators involving the shoulder (shoulder push 
and shoulder pull); the two static strength measures involving hand grip strength (right grip strength and left grip 
strength); and the two skinfold measures involving the arm (bicep and tricep). 
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Table 3 

Goodness of Fit For Separate Solutions For Each Group With No Invariance Constraints and For Selected Invariance 
Constraints Imposed Across All Groups 



Model                            P-Ratio     χ²     DF    TLI    RNI    PRNI 

No Invariance Constraints 
  Boys Age=9                       .817      600    245   .894   .914   .746 
  Boys Age=12                      .817      789    245   .899   .918   .749 
  Boys Age=15                      .817      664    245   .906   .924   .754 
  Girls Age=9                      .817      834    245   .865   .890   .727 
  Girls Age=12                     .817      791    245   .876   .899   .734 
  Girls Age=15                     .817      629    245   .881   .902   .737 
  6-Group Total                    .817     4309   1470   .888   .908   .742 

Invariance Constraints (the same constraint over all 6 groups) 
  No Invar                         .817     4309   1470   .888   .908   .742 
  FL Invar                         .861     4594   1550   .886   .902   .776 
  FL, FCr Invar                    .939     4884   1690   .890   .897   .842 
  FL, FCr, FV Invar                .961     5506   1730   .873   .878   .844 
  FL, FCr, U Invar                1.017     7349   1830   .824   .821   .835 
  Total (FL, FCr, FV, U) Invar    1.039     8188   1870   .803   .796   .827 


Note. P-Ratio = Parsimony Ratio. TLI = Tucker-Lewis Index. RNI = Relative Noncentrality Index. PRNI = 
Parsimony RNI. FL = Factor loadings. FCr = Factor Correlations. FV = Factor Variances. U = uniquenesses. The 
total χ² and degrees of freedom summed across the 6 groups considered separately is necessarily the same as for the 
corresponding analysis across the 6 groups in which no invariance constraints are imposed. 
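
The parsimony indices in Table 3 can be checked with a few lines of arithmetic. The sketch below assumes the usual definitions (parsimony ratio = model DF divided by null-model DF, and PRNI = parsimony ratio x RNI) and assumes a null model that estimates only the 25 variances in each group; these assumptions are not spelled out in the note above, but the tabled values are consistent with them. The function and variable names are illustrative.

```python
N_VARIABLES = 25  # measured indicators per group


def null_model_df(n_variables, n_groups=1):
    """Degrees of freedom of the independence (null) model.

    Assumes the null model estimates only the variances, so each group
    contributes one degree of freedom per off-diagonal covariance.
    """
    return n_groups * n_variables * (n_variables - 1) // 2


def parsimony_ratio(model_df, n_groups=1, n_variables=N_VARIABLES):
    """Model DF relative to null-model DF."""
    return model_df / null_model_df(n_variables, n_groups)


def parsimony_rni(model_df, rni, n_groups=1, n_variables=N_VARIABLES):
    """RNI penalized for lack of parsimony."""
    return parsimony_ratio(model_df, n_groups, n_variables) * rni


# (label, number of groups, DF, RNI, tabled P-Ratio, tabled PRNI), from Table 3.
TABLE_3_ROWS = [
    ("Boys Age=9", 1, 245, 0.914, 0.817, 0.746),
    ("No Invar", 6, 1470, 0.908, 0.817, 0.742),
    ("FL Invar", 6, 1550, 0.902, 0.861, 0.776),
    ("FL, FCr Invar", 6, 1690, 0.897, 0.939, 0.842),
    ("FL, FCr, FV Invar", 6, 1730, 0.878, 0.961, 0.844),
    ("FL, FCr, U Invar", 6, 1830, 0.821, 1.017, 0.835),
    ("Total Invar", 6, 1870, 0.796, 1.039, 0.827),
]

# Chi-square values for the six separate single-group solutions in Table 3.
SEPARATE_CHI_SQUARES = [600, 789, 664, 834, 791, 629]

if __name__ == "__main__":
    for label, groups, df, rni, p_tabled, prni_tabled in TABLE_3_ROWS:
        p = parsimony_ratio(df, groups)
        prni = parsimony_rni(df, rni, groups)
        print(f"{label:20s} P-Ratio {p:.3f} (tabled {p_tabled:.3f})  "
              f"PRNI {prni:.3f} (tabled {prni_tabled:.3f})")
    print("sum of separate chi-squares:", sum(SEPARATE_CHI_SQUARES))
```

The six separate χ² values sum to 4307, which agrees with the tabled 6-group total of 4309 to within rounding of the individual entries.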
Table 4 

Goodness of Fit For Solutions With Selected Invariance Constraints Imposed Across Age Within Gender and Across 
Gender Within Age 

Model                                P-Ratio     χ²     DF    TLI    RNI    PRNI 

[The opening rows of the table are only partly legible; -- marks an entry that could not be read. Surviving fragments of the first rows: .899; .747.] 

Across Age Within Gender 
  Total                                .852     4520   1534   .887   .903   .770 
Across Gender Within Age 
  Across Gender For Age=9              .843      --     506   .876   .896   .755 
  Across Gender For Age=12             .843      --     506   .892   .909    -- 

[Rows lost in extraction.] 

Across Age Within Gender 
  Across Age For Boys                  .914      --     823   .904   .912   .834 
  Across Age For Girls                 .914     2555    823   .854   .867   .792 
  Total                                .914     4763   1646   .890   .899   .822 
Across Gender Within Age 
  Across Gender For Age=9              .890     1541    534   .881   .894   .796 
  Across Gender For Age=12             .890     1636    534   .897   .908   .808 
  Across Gender For Age=15             .890     1469    534   .889   .901   .802 
  Total                                .890     4637   1602   .890   .902   .803 

Factor Loading, Factor Corr, Factor Variance Invariance 
Across Age Within Gender 
  Across Age For Boys                  .932     2651    839   .880   .888   .828 
  Across Age For Girls                 .932     2646    839   .851   .861   .802 
  Total                                .932     5229   1678   .877   .885   .825 
Across Gender Within Age 
  Across Gender For Age=9              .903     1589    542   .878   .890   .804 
  Across Gender For Age=12             .903     1643    542   .899   .908   .821 
  Across Gender For Age=15             .903     1545    542   .882   .894   .807 
  Total                                .903     4777   1626   .887   .898   .811 

Total (Factor Loading, Corr, Var, and Unique) Invariance 
Across Age Within Gender 
  Across Age For Boys                  .994     3827    895   .818   .819   .814 
  Across Age For Girls                 .994     3596    895   .791   .792   .788 
  Total                                .994     7423   1790   .817   .818   .813 
Across Gender Within Age 
  Across Gender For Age=9              .950     1810    570   .862   .869   .826 
  Across Gender For Age=12             .950     1757    570   .896   .901   .856 
  Across Gender For Age=15             .950     2115    570   .827   .836   .794 
  Total                                .950     5683   1710   .865   .871   .828 


validation Index. PRNI = Parsimony RNI. FL = Factor loadings. FCr = Factor Correlations. FV = Factor Variances. 
U = uniquenesses. 
Table 5 

Goodness of Fit For Solutions With Selected Invariance Constraints Imposed On Specific Groups of Boys (B) and 
Girls (G) Aged 9, 12 and 15. 

Model                                          P-Ratio    χ²     DF    TLI    RNI    PRNI 

Factor loadings, Factor Correlations Invariant Across All Groups; 
Uniquenesses Non-Invariant (free) For all groups; 
Patterns of Factor Variance Invariances as Follows: 

FV: B9=B12=G9=G12=G15; B15=NI                    .957    5182   1722   .883   .888   .850 

FV: B9=B12=G9=G12; G15=NI; B15=NI                .952    5140   1714   .884   .889   .847 

FV: B9=G9; B12=G12; G15=NI; B15=NI               .948    4944   1706   .890   .895   .849 

FV: B9=G9; B12=G12=G15; B15=NI                   .952    4958   1714   .890   .895   .852 

Factor loadings, Factor Correlations Invariant Across All Groups; 
Patterns of Uniqueness (U) and Factor Variance (FV) Invariances as Follows: 

U: B9=G9=B12=G12; G15=NI; B15=NI; 
FV: B9=G9=B12=G12; G15=NI; B15=NI;               .999    5810   1798   .870   .870   .869 

U: B9=G9; B12=G12; G15=NI; B15=NI; 
FV: B9=G9; B12=G12; G15=NI; B15=NI;              .979    5263   1762   .884   .887   .868 

U: B12=G12; B9=NI; G9=NI; G15=NI; B15=NI; 
FV: B9=G9; B12=G12; G15=NI; B15=NI.              .963    5052   1734   .889   .893   .860 

U: B9=G9; B12=G12=G15; B15=NI; 
FV: B9=G9; B12=G12=G15; B15=NI;                  .983    5492   1770   .878   .880   .865 

U: B12=G12; B9=NI; G9=NI; G15=NI; B15=NI; 
FV: B9=G9; B12=G12=G15; B15=NI;                  .968    5068   1742   .890   .892   .864 

Note. P-Ratio = Parsimony Ratio. TLI = Tucker-Lewis Index. RNI = Relative Noncentrality Index. PRNI = Parsimony 
RNI. FL = Factor loadings. FCr = Factor Correlations. FV = Factor Variances. U = uniquenesses. NI = not invariant. For all models 
summarized in this table, factor loadings and factor correlations were invariant across all six gender/age groups. In the 
first set of models, factor variances - but not uniquenesses - were constrained to be equal across various combinations 
of groups. Thus, for example, in the final model of the first set, factor variances were held to be invariant in solutions for boys and girls aged 
9, and for boys and girls aged 12 and girls aged 15; factor variances for boys aged 15 were not constrained to be 
invariant with any other groups. In the second set of models, factor variances and uniquenesses were constrained to be 
equal across different combinations of groups. 
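
The degrees of freedom for the first set of models in Table 5 follow from counting equality constraints. The sketch below assumes that each group has 8 factor variances (one per factor), that holding a parameter equal across k groups adds k - 1 degrees of freedom, and that the starting point is the model with invariant factor loadings and factor correlations (DF = 1690 in Table 3). The function name and data layout are illustrative.

```python
N_FACTOR_VARIANCES = 8   # one variance per factor in the 8-factor model
BASE_DF = 1690           # FL and FCr invariant, FV and U free (Table 3)


def df_with_equality_sets(equality_sets, base_df=BASE_DF,
                          n_params=N_FACTOR_VARIANCES):
    """DF after constraining n_params parameters to be equal within each set.

    Each set of k groups contributes n_params * (k - 1) extra degrees of
    freedom; groups that appear in no set stay unconstrained (NI).
    """
    gained = sum(n_params * (len(groups) - 1) for groups in equality_sets)
    return base_df + gained


# Factor-variance patterns from the first block of Table 5, with tabled DF.
FV_MODELS = [
    ("B9=B12=G9=G12=G15; B15=NI", [["B9", "B12", "G9", "G12", "G15"]], 1722),
    ("B9=B12=G9=G12; G15=NI; B15=NI", [["B9", "B12", "G9", "G12"]], 1714),
    ("B9=G9; B12=G12; G15=NI; B15=NI", [["B9", "G9"], ["B12", "G12"]], 1706),
    ("B9=G9; B12=G12=G15; B15=NI", [["B9", "G9"], ["B12", "G12", "G15"]], 1714),
]

if __name__ == "__main__":
    for label, sets, tabled_df in FV_MODELS:
        print(f"{label:34s} computed DF {df_with_equality_sets(sets)}  "
              f"tabled DF {tabled_df}")
```

Under these assumptions the computed values reproduce the four tabled DF entries, and constraining all six groups gives 1690 + 8 x 5 = 1730, the DF reported for the FL, FCr, FV model in Table 3.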
Table 6 

Nine Factor Solution For 12 Year Old Boys Standardized To A Common Metric 

[The factor loadings and uniquenesses (U) did not survive extraction. In the rows below, -- marks an entry that could not be read, and minus signs are shown only where legible.] 

Factor Correlations 
              Cardiovas  Explosive  Dynamic   Static  Flexibil  Blood Pr   Lung    Girth   Skinfold 
Cardiovas        1 
Explosive       .547        1 
Dynamic         .702       .925       1 
Static          .070       .517       --        1 
Flexibil        .109       .242      .341      .229      1 
Blood Pr       -.168       .095      .145      .278     .092       1 
Lung            .019       .256      .183      .690     .125      .231      1 
Girth           .420      -.165     -.256      .573     .005      .321     .465      1 
Skinfold       -.499      -.418     -.529      .210     .069      .216     .170     .871      1 

Factor Variances 
                .875      1.032     1.158      .945    1.020      .974     .989    1.122    1.162 




Note: See Table 1 for a description of the measured variables. This solution based on data for 12 year old boys was 
standardized in relation to a common metric across all 6 groups to facilitate interpretations (see Joreskog & Sorbom, 
1989). The factor loadings and factor correlations that are the major focus of the present investigation were invariant 
across all 6 groups, but factor variances and uniquenesses varied depending on the group (see discussion of results). 
This 9-factor solution is technically improper in that the factor correlation matrix is not positive definite; even 
though none of the correlations exceeds 1, the correlation between the dynamic and explosive strength factors (r = .925) 
approaches 1. 
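
A correlation matrix can fail to be positive definite even when every individual correlation is below 1 in absolute value. The sketch below illustrates this with a hand-written Cholesky test on two small matrices; both matrices are hypothetical and are not the Table 6 estimates, and the function name is illustrative.

```python
import math


def is_positive_definite(matrix):
    """Return True if a symmetric matrix admits a Cholesky factorization.

    The factorization succeeds only when every pivot is strictly positive,
    which is equivalent to the matrix being positive definite.
    """
    size = len(matrix)
    lower = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = matrix[i][i] - partial
                if pivot <= 0.0:
                    return False
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return True


# Hypothetical example: no correlation exceeds 1, but the pattern is
# impossible (factors 1 and 2 correlate .925, factor 3 correlates .90 with
# factor 2 yet only .10 with factor 1).
improper = [
    [1.000, 0.925, 0.100],
    [0.925, 1.000, 0.900],
    [0.100, 0.900, 1.000],
]

# Hypothetical example of an admissible correlation matrix.
proper = [
    [1.0, 0.5, 0.3],
    [0.5, 1.0, 0.4],
    [0.3, 0.4, 1.0],
]

if __name__ == "__main__":
    print("improper matrix positive definite:", is_positive_definite(improper))
    print("proper matrix positive definite:", is_positive_definite(proper))
```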

a Correlated uniquenesses suggest that the relationship between a pair of measured variables is greater than can be 
explained by the posited structure. The four correlated uniquenesses posited here represent relations between the two 
running measures (1.6 K run and 50 M dash); the two static strength indicators involving the shoulder (shoulder 
push and shoulder pull); the two static strength measures involving hand grip strength (right grip strength and left 
grip strength); and the two skinfold measures involving the arm (bicep and tricep). 
Appendix 1A 

Descriptive Statistics for the 25 Physical Fitness Variables: Means and Standard Deviations For Six Groups 

                          Boys                                     Girls 
              Age=9        Age=12       Age=15       Age=9        Age=12       Age=15 
Variables    Mean   SD    Mean   SD    Mean   SD    Mean   SD    Mean   SD    Mean   SD 

[Minus signs are shown only where legible in the source copy.] 

VO2M          .39   .94    .26   .84   -.79   .87   -.52   .73    .50   .92   -.62   .78 
1.6K Run      .07   .87    .45   .75    .87   .69    .67   .96    .33   .86   -.41   .96 
PWC170       -.54   .47    .15   .66   1.57   .89    .91   .40    .28   .57    .13   .63 
50M Dash     -.50   .75    .22   .76   1.20   .64    .89   .87    .04   .75    .18   .78 
Long Jump    -.56   .62    .16   .75   1.37   .83    .84   .65    .16   .72    .16   .71 
Situp        -.43   .75    .39  1.08    .81   .99    .52   .70    .09   .89   -.12   .87 
Pushup        .19   .76    .32   .87   1.00   .91    .29   .76   -.51   .84   -.78   .74 
Rt Grip      -.75   .38   -.01   .57   1.57   .87    .97   .34   -.12   .52    .51   .53 
Lft Grip     -.72   .37    .03   .58   1.60   .89    .95   .34   -.17   .51    .44   .53 
Shld Pull    -.70   .39   -.03   .59   1.49  1.06   -.87   .34   -.17   .50    .54   .64 
Shld Push    -.84   .30   -.21   .58   1.46  1.02   -.79   .34   -.02   .53    .68   .58 
Leg          -.66   .42    .15   .63   1.58   .95   -.87   .38   -.23   .53    .22   .62 
Sit/reach    -.41   .76   -.53   .86   -.13  1.07    .14   .83    .23   .92    .81   .96 
Systolic     -.49   .86   -.09   .89    .81   .94   -.46   .89   -.02   .90    .37   .87 
Diastolic    -.24  1.05   -.14   .99    .29   .95   -.16   .97    .03   .99    .29   .90 
FVC          -.75   .38    .01   .55   1.54   .84  -1.00   .37   -.09   .53    .57   .55 
FEV1         -.77   .38   -.05   .54   1.48   .85   -.99   .38   -.05   .55    .66   .58 
Arm Girth    -.79   .68   -.01   .87    .96   .83    .67   .67    .03   .76    .71   .78 
Waist Grth   -.63   .69    .16   .97    .91   .78    .73   .72    .08   .84    .51   .88 
Hip Girth    -.91   .57   -.09   .75    .88   .70    .85   .60    .17   .71   1.01   .71 
Skinfold1    -.41   .73   -.11  1.06    .58   .73    .15   .87    .21   .92    .82  1.03 
Skinfold2    -.35   .75   -.11  1.01    .56   .73    .21   .92    .28   .99    .55  1.12 
Skinfold3    -.46   .66   -.10  1.12    .14   .78    .02  1.01    .17  1.03    .63  1.00 
Skinfold4    -.42   .70    .02  1.19    .13   .92    .02  1.00    .15  1.00    .47   .90 
Skinfold5    -.46   .78   -.01  1.14    .J4   .87    .04   .98    .18  1.00    .56   .90 

Note: See Table 1 for a description of the variables. All variables were standardized (Mean=0, SD=1) across the 
total group. Thus, for example, VO2max for 9-year-old boys is .39 (total group) standard deviations above the mean 
for the total group, and the scores are slightly less variable (SD=.94 vs. 1.0 for the total group). Times for the 
running measures (1.6K run and 50M dash) were multiplied by -1 so that positive scores reflected more positive 
fitness. 
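
The scaling described in this note can be sketched as follows. The run times are hypothetical, and the use of the population standard deviation is an assumption, since the note does not say which form was used; the function names are illustrative.

```python
from statistics import mean, pstdev


def standardize(values):
    """Rescale values to mean 0 and SD 1 across the whole (pooled) group.

    Uses the population SD; whether the original analysis used the
    population or the sample SD is not stated, so this is an assumption.
    """
    center = mean(values)
    spread = pstdev(values)
    return [(value - center) / spread for value in values]


def reflect(z_scores):
    """Multiply by -1 so that faster (smaller) times give higher scores."""
    return [-z for z in z_scores]


# Hypothetical 1.6 K run times in seconds for a pooled sample.
run_times = [420.0, 480.0, 540.0, 600.0, 660.0]
run_scores = reflect(standardize(run_times))

if __name__ == "__main__":
    for seconds, score in zip(run_times, run_scores):
        print(f"{seconds:6.0f} s -> {score:+.2f}")
```

A subgroup mean on this scale (for example, the mean of the scores for 9-year-old boys only) is then read as a distance from the total-group mean in total-group standard deviation units, which is how the tabled means are interpreted.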
Descriptive Statistics for the 25 Physical Fitness Variables: Correlations for Three Groups of Boys 

[Correlation matrices for boys aged 9 and 12: only scattered entries survive in the extracted text, without enough row and column context to reconstruct the matrices.] 

SEX=BOYS AGE=15 

Variables, in order: (1) VO2M, (2) 1.6K Run, (3) PWC170, (4) 50M Dash, (5) Long Jump, (6) Situp, (7) Pushup, 
(8) Rt Grip, (9) Lft Grip, (10) Shld Pull, (11) Shld Push, (12) Leg, (13) Sit/reach, (14) Systolic, (15) Diastolic, 
(16) FVC, (17) FEV1, (18) Arm Girth, (19) Waist Grth, (20) Hip Girth, (21) Skinfold1, (22) Skinfold2, 
(23) Skinfold3, (24) Skinfold4, (25) Skinfold5 

[Rows 1 to 7 and row 9 survive only as fragments: 16; 49; 24; 24, 19, 40, 39; Lft Grip (partial) 04, 50, 41, 38, 05, 87. In the rows below, -- marks an entry that could not be read.] 

Columns 1-15 
                 1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
 8 Rt Grip     -15   06   51   44   40   04   30    1 
10 Shld Pull   -19   12   44   41   40   06   23   63   61    1 
11 Shld Push    --   07   49   34   37   08   16   57   56   66    1 
12 Leg         -03   10   45   36   38   13   26   64   67   55   52    1 
13 Sit/reach    07   08   17   21   33   10   22   31   31   26   28   23    1 
14 Systolic    -20  -06   11   12   10   01   06   27   24   20   20   20   20    1 
15 Diastolic    05  -05   09  -01  -03  -02   00   18   15   11   11   14   09   43    1 
16 FVC         -12   11   57   35   38   04   12   62   61   58   59   56   28   27   15 
17 FEV1        -09   10   53   37   36   04   14   58   56   55   58   55   26   25   18 
18 Arm Girth   -25  -17   32   14   15  -13   02   57   56   52   45   47   18   30   10 
19 Waist Grth  -10  -21   31   01   00  -17  -23   40   43   37   36   36   04   27   11 
20 Hip Girth   -27  -19   35   08   06  -18  -15   48   49   45   41   38   11   31   14 
21 Skinfold1   -22  -41  -08  -34  -32  -23  -34   01   02   05   02   01  -09   19   06 
22 Skinfold2   -07  -43  -14  -40  -31  -22  -36  -07  -06  -02  -04  -07  -10   13   07 
23 Skinfold3   -15  -37   02  -24  -24  -24  -33   14   15   12   07   08  -01   22   11 
24 Skinfold4   -09  -30  -02  -21  -23  -19  -32   09   10   10   06   05  -04   25   14 
25 Skinfold5   -15  -36  -01  -24  -25  -29  -37   11   10   10   07   07  -06   24   07 

Columns 16-25 
                16   17   18   19   20   21   22   23   24   25 
16 FVC           1 
17 FEV1         90    1 
18 Arm Girth    46   42    1 
19 Waist Grth   46   39   78    1 
20 Hip Girth    47   44   76   81    1 
21 Skinfold1   -01  -03   55   63   56    1 
22 Skinfold2   -11  -13   43   49   47   81    1 
23 Skinfold3    08   05   60   73   66   76   75    1 
24 Skinfold4    10   06   53   68   57   78   65   79    1 
25 Skinfold5    --   --   --   74   63   84   70   83   83    1 



Note. All correlations are presented without decimal points. See Table 1 for a description of the variables. 
The actual analyses were conducted on covariance matrices that can be constructed (by LISREL) from the 
correlations presented here and the standard deviations presented at the start of this appendix. 
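
The conversion mentioned in this note is simple to reproduce: each covariance is the correlation multiplied by the two standard deviations. The sketch below uses one pair of variables for 15-year-old boys as read from this appendix (FVC and FEV1, correlation printed as 90, standard deviations .84 and .85); the function name is illustrative, and this is a plain-Python illustration rather than the LISREL procedure itself.

```python
def covariance_matrix(correlations_x100, sds):
    """Rebuild covariances from correlations printed without decimal points.

    correlations_x100[i][j] is the correlation multiplied by 100
    (so 90 means .90); sds[i] is the standard deviation of variable i
    in the same group.
    """
    size = len(sds)
    return [
        [(correlations_x100[i][j] / 100.0) * sds[i] * sds[j]
         for j in range(size)]
        for i in range(size)
    ]


# FVC and FEV1 for 15-year-old boys, as read from this appendix.
correlations = [[100, 90],
                [90, 100]]
standard_deviations = [0.84, 0.85]

covariances = covariance_matrix(correlations, standard_deviations)

if __name__ == "__main__":
    for row in covariances:
        print("  ".join(f"{value:.4f}" for value in row))
```

The diagonal entries are the variances (the squared standard deviations), and the off-diagonal entry is .90 x .84 x .85 = .6426.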
SEX=GIRLS AGE=9 

[Only fragments of this matrix survive in the extracted text; "..." marks entries lost in extraction.] 

Rt Grip      ... 03 
Lft Grip     08 09 ... 03 
Shld Pull    -01 16 29 31 30 19 14 48 47 1 
Shld Push    07 13 21 28 30 24 ... 44 58 ... 
(row label lost)  16 22 27 19 -01 43 46 32 31 ... 
Sit/reach    -05 ... 09 ... 24 ... 36 43 37 04 21 16 ... 
Hip Girth    ... 